
Effective methods and strategies for massive small files processing based on Hadoop



Abstract

The Hadoop framework provides a powerful way to handle Big Data. Because Hadoop suffers from inherent defects of high memory overhead and low computing performance when processing massive numbers of small files, in this paper we implement three methods and propose two strategies for solving the small files problem. First, we implement three methods, namely Hadoop Archives (HAR), Sequence Files (SF) and CombineFileInputFormat (CFIF), to compensate for these defects. Moreover, we propose two strategies to meet the actual needs of different users. Finally, we evaluate the efficiency of the implemented methods and the validity of the proposed strategies. The experimental results show that our methods and strategies improve the efficiency of massive small files processing, thereby enhancing the overall performance of Hadoop. © 2014 ISSN 1881-803X.
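As an illustration of why combining inputs helps, the sketch below packs many small files into a few size-bounded groups, which is the general idea behind CombineFileInputFormat: with Hadoop's default file input formats each small file becomes at least one input split (and so one map task), whereas combined splits let one task read many files. This is plain Python, not Hadoop's API and not the paper's implementation. The names `CombinedSplit` and `pack_small_files`, the file paths, and the sizes are hypothetical, and the greedy size-only packing ignores the node and rack locality that the real input format takes into account.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CombinedSplit:
    """A group of small files that one map task would process together."""
    paths: List[str] = field(default_factory=list)
    total_bytes: int = 0


def pack_small_files(file_sizes: Dict[str, int],
                     max_split_bytes: int) -> List[CombinedSplit]:
    """Greedily pack files into splits no larger than max_split_bytes.

    Simplifying assumption: a file larger than the limit simply gets a
    split of its own instead of being divided into blocks.
    """
    if max_split_bytes <= 0:
        raise ValueError("max_split_bytes must be positive")

    splits: List[CombinedSplit] = []
    current = CombinedSplit()
    # Sort by path so the packing is deterministic.
    for path, size in sorted(file_sizes.items()):
        # Close the current split if adding this file would overflow it.
        if current.paths and current.total_bytes + size > max_split_bytes:
            splits.append(current)
            current = CombinedSplit()
        current.paths.append(path)
        current.total_bytes += size
    if current.paths:
        splits.append(current)
    return splits


if __name__ == "__main__":
    # Hypothetical workload: 1,000 files of 100 KiB each, 64 MiB split limit.
    sizes = {f"/data/part-{i:04d}.txt": 100 * 1024 for i in range(1000)}
    splits = pack_small_files(sizes, max_split_bytes=64 * 1024 * 1024)
    print(f"{len(sizes)} files -> {len(splits)} combined splits")
```

In this hypothetical workload the number of units of work drops from one per file to one per combined split, which is the kind of task-count reduction the CFIF method targets.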
